148 research outputs found

Partial Homology Relations - Satisfiability in terms of Di-Cographs

Author: A Brandstädt
AM Altenhoff
C Crespelle
C Dessimoz
DG Corneil
F Chen
F Gurski
G Östlund
J Engelfriet
J Sukumaran
JG Lawrence
K Hartmann
K Trachana
M Hellmuth
M Lafond
M Lechner
M Ravenhall
R Dondi
RL Tatusov
RM McConnell
S Böcker
WM Fitch
Y Gao
Y Liu
Publication date: 03/05/2018

Directed cographs (di-cographs) play a crucial role in the reconstruction of evolutionary histories of genes based on homology relations, which are binary relations between genes. A variety of methods based on pairwise sequence comparisons can be used to infer such homology relations (e.g. orthology, paralogy, xenology). They are satisfiable if the relations can be explained by an event-labeled gene tree, i.e., they can simultaneously co-exist in an evolutionary history of the underlying genes. Every gene tree is equivalently interpreted as a so-called cotree that entirely encodes the structure of a di-cograph. Thus, satisfiable homology relations must necessarily form a di-cograph. The inferred homology relations might not cover each pair of genes and thus provide only partial knowledge on the full set of homology relations. Moreover, for particular pairs of genes, it might be known with a high degree of certainty that they are not orthologs (resp. paralogs, xenologs), which yields forbidden pairs of genes. Motivated by this observation, we characterize (partial) satisfiable homology relations with or without forbidden gene pairs, and provide a quadratic-time algorithm for their recognition and for the computation of a cotree that explains the given relations.
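
As a rough illustration of the necessary condition mentioned in this abstract, the sketch below checks the simpler undirected case only: a symmetric relation (for instance, orthology alone) forms a cograph exactly when its graph contains no induced path on four vertices (P4). This is a brute-force check written for clarity, not the quadratic-time algorithm the abstract refers to, and it does not handle directed relations, partial relations, or forbidden pairs. The function names `has_induced_p4` and `is_cograph` are hypothetical.

```python
from itertools import permutations


def has_induced_p4(vertices, edges):
    """Return True if the undirected graph contains an induced path a-b-c-d."""
    # Store edges as unordered pairs so (u, v) and (v, u) are the same edge.
    edge_set = {frozenset(e) for e in edges}

    def adjacent(u, v):
        return frozenset((u, v)) in edge_set

    # Brute force over all ordered 4-tuples of distinct vertices: O(n^4).
    for a, b, c, d in permutations(vertices, 4):
        path_present = adjacent(a, b) and adjacent(b, c) and adjacent(c, d)
        no_chords = not (adjacent(a, c) or adjacent(a, d) or adjacent(b, d))
        if path_present and no_chords:
            return True
    return False


def is_cograph(vertices, edges):
    """A graph is a cograph if and only if it has no induced P4."""
    return not has_induced_p4(vertices, edges)


genes = ["a", "b", "c", "d"]
# A path a-b-c-d is itself a P4, so this relation cannot be a cograph.
print(is_cograph(genes, [("a", "b"), ("b", "c"), ("c", "d")]))
# Two disjoint edges contain no P4, so this relation is a cograph.
print(is_cograph(genes, [("a", "b"), ("c", "d")]))
```

For real data sets a linear-time cograph recognition method would be preferable; the brute-force version is only meant to make the P4-free condition concrete.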

arXiv.org e-Print Archive

Crossref

University of Southern Denmark Research Output

Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology.

Author: Altenhoff AM
Dessimoz C
Piližota I
Wittwer LD
Publication date: 01/10/2014

Orthology inference and other sequence analyses across multiple genomes typically start by performing exhaustive pairwise sequence comparisons, a process referred to as "all-against-all". As this process scales quadratically in terms of the number of sequences analysed, this step can become a bottleneck, thus limiting the number of genomes that can be simultaneously analysed. Here, we explored ways of speeding up the all-against-all step while maintaining its sensitivity. By exploiting the transitivity of homology and, crucially, ensuring that homology is defined in terms of consistent protein subsequences, our proof-of-concept resulted in a 4× speedup while recovering >99.6% of all homologs identified by the full all-against-all procedure on empirical sequence sets. In comparison, state-of-the-art k-mer approaches are orders of magnitude faster but only recover 3-14% of all homologous pairs. We also outline ideas to further improve the speed and recall of the new approach. An open source implementation is provided as part of the OMA standalone software at http://omabrowser.org/standalone
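
To make the quadratic scaling mentioned in this abstract concrete, the following sketch counts the comparisons in an exhaustive all-against-all run. It assumes each unordered pair of distinct sequences is compared exactly once, which is a simplification for illustration and not a description of the OMA implementation; the function name `all_against_all_pairs` is hypothetical.

```python
def all_against_all_pairs(n_sequences: int) -> int:
    """Count unordered pairs of distinct sequences: n * (n - 1) / 2."""
    if n_sequences < 0:
        raise ValueError("n_sequences must be non-negative")
    return n_sequences * (n_sequences - 1) // 2


# Doubling the number of sequences roughly quadruples the number of comparisons.
for n in (1_000, 2_000, 4_000):
    print(n, all_against_all_pairs(n))
```

Under this assumption, 1,000 sequences need 499,500 comparisons, while 4,000 sequences need 7,998,000.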

Repository for Publications and Research Data

Directory of Open Access Journals

UCL Discovery

PubMed Central

Gene Ontology: Pitfalls, Biases, and Remedies.

Author: A Schlicker
AF Baas
AJ Vilella
AK Rider
AM Altenhoff
AM Schnoes
C Dessimoz
C Hass
D Binns
EL Clarke
H Mi
JF Granada
JL Sevilla
M Mistry
MG Mason
N Škunca
NL Nehrt
P Gaudet
PD Thomas
PJ Bickel
RP Huntley
SY Rhee
T. Gene and Ontology Consortium
Y Jiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/02/2016

The Gene Ontology (GO) is a formidable resource, but there are several considerations about it that are essential to understand the data and interpret it correctly. The GO is sufficiently simple that it can be used without deep understanding of its structure or how it is developed, which is both a strength and a weakness. In this chapter, we discuss some common misinterpretations of the ontology and the annotations. A better understanding of the pitfalls and the biases in the GO should help users make the most of this very rich resource. We also review some of the misconceptions and misleading assumptions commonly made about GO, including the effect of data incompleteness, the importance of annotation qualifiers, and the transitivity or lack thereof associated with different ontology relations. We also discuss several biases that can confound aggregate analyses such as gene enrichment analyses. For each of these pitfalls and biases, we suggest remedies and best practices.

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

Serveur académique lausannois

UCL Discovery

Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs

Author: A Bairoch
A Henricson
A Schlicker
ACJ Roth
Adrian M. Altenhoff
AM Altenhoff
B Mirkin
C Pesquita
CA Wilson
Christophe Dessimoz
D Barrell
D Lin
J Huerta-Cepas
JA Eisen
Jonathan A. Eisen
K Forslund
L du Plessis
M Kimura
Marc Robinson-Rechavi
ME Peterson
NL Nehrt
P Bork
P Flicek
P Jaccard
P Resnik
PD Thomas
R Rentzsch
RA Studer
RL Tatusov
Romain A. Studer
TJP Hubbard
V Sangar
W Qian
Publication venue: Public Library of Science
Publication date: 01/01/2012

The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the “ortholog conjecture”, or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs have generally more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably, orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins.

Public Library of Science (PLOS)

Repository for Publications and Research Data

Crossref

Serveur académique lausannois

Directory of Open Access Journals

PubMed Central

UCL Discovery

FigShare

OMA orthology in 2021: website overhaul, conserved isoforms, ancestral gene order and more

Author: Altenhoff AM
Dessimoz C
Gilbert KJ
Glover NM
Mediratta I
Mendes de Farias T
Moi D
Nevers Y
Radoykova H-S
Rossier V
Train C-M
Warwick Vesztrocy A
Publication date: 11/11/2020

OMA is an established resource to elucidate evolutionary relationships among genes from currently 2326 genomes covering all domains of life. OMA provides pairwise and groupwise orthologs, functional annotations, local and global gene order conservation (synteny) information, among many other functions. This update paper describes the reorganisation of the database into gene-, group- and genome-centric pages. Other new and improved features are detailed, such as reporting of the evolutionarily best conserved isoforms of alternatively spliced genes, the inferred local order of ancestral genes, phylogenetic profiling, better cross-references, fast genome mapping, semantic data sharing via RDF, as well as a special coronavirus OMA with 119 viruses from the Nidovirales order, including SARS-CoV-2, the agent of the COVID-19 pandemic. We conclude with improvements to the documentation of the resource through primers, tutorials and short videos. OMA is accessible at https://omabrowser.org

UCL Discovery

Reconstruction of time-consistent species trees

Author: A Tofigh
AB Kahn
ACJ Roth
AM Altenhoff
AV Aho
BTL Nichio
C Rancurel
C Semple
D Harel
D Hasić
GS Gray
J-P Doyon
JG Lawrence
L Li
LJ Jensen
M Geiß
M Hellmuth
M Hernandez-Rosales
M Kordi
M Lafond
M Lechner
M Ravenhall
M Steel
Manuel Lafond
Marc Hellmuth
MS Bansal
N Nøjgaard
PF Stadler
R Dondi
S Tao
TG Villa
W Ma
WM Fitch
Y Ovadia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/08/2020

Background: The history of gene families, which are equivalent to event-labeled gene trees, can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are “biologically feasible”, which is the case if one can find a species tree with which the gene tree can be reconciled in a time-consistent way. Results: In this contribution, we consider event-labeled gene trees that contain speciations, duplications as well as horizontal gene transfer (HGT), and we assume that the species tree is unknown. Although many problems become NP-hard as soon as HGT and time-consistency are involved, we show, in contrast, that the problem of finding a time-consistent species tree for a given event-labeled gene tree can be solved in polynomial time. We provide a cubic-time algorithm to decide whether a “time-consistent” species tree for a given event-labeled gene tree exists and, in the affirmative case, to construct the species tree within the same time complexity.

Crossref

White Rose Research Online

A new, fast algorithm for detecting protein coevolution using maximum compatible cliques

Author: A Rodionov
A Valencia
AK Ramani
Alex Rodionov
Alexandr Bezginov
AM Altenhoff
D MacLeod
D Robinson
Elisabeth RM Tillier
ERM Tillier
F Pazos
GW Clark
J Felsenstein
Jonathan Rose
K Katoh
MK Kuhner
PRJ Östergård
R Jothi
RG Beiko
RM Karp
S Razick
T Sato
V Soria-Carrasco
W Li
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The MatrixMatchMaker algorithm was recently introduced to detect the similarity between phylogenetic trees and thus the coevolution between proteins. MMM finds the largest common submatrices between pairs of phylogenetic distance matrices, and has numerous advantages over existing methods of coevolution detection. However, these advantages came at the cost of a very long execution time. Results: In this paper, we show that the problem of finding the maximum submatrix reduces to a multiple maximum clique subproblem on a graph of protein pairs. This allowed us to develop a new algorithm and program implementation, MMMvII, which achieved more than 600× speedup with comparable accuracy to the original MMM. Conclusions: MMMvII will thus allow for more extensive and intricate analyses of coevolution. Availability: An implementation of the MMMvII algorithm is available at: http://www.uhnresearch.ca/labs/tillier/MMMWEBvII/MMMWEBvII.php

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Beyond representing orthology relations by trees

Author: A Tofigh
AM Altenhoff
C Semple
Consortium T.G.O.
D Huson
D Wen
E Jacox
F Tekaia
G Jin
G. E. Scholz
J Jun
K Chen
K. T. Huber
KT Huber
L Nakhleh
LJJ Iersel van
M Hellmuth
M Lafond
M Stolzer
MS Bansal
O Mahmudi
P Gambette
P Górecki
R Tatusov
S Böcker
S Willson
Y Ovadia
Y Yu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/11/2016

Reconstructing the evolutionary past of a family of genes is an important aspect of many genomic studies. To help with this, simple relations on a set of sequences called orthology relations may be employed. In addition to being interesting from a practical point of view, they are also attractive from a theoretical perspective in that, e.g., a characterization is known for when such a relation is representable by a certain type of phylogenetic tree. For an orthology relation inferred from real biological data it is, however, generally too much to hope for that it satisfies that characterization. Rather than trying to correct the data in some way or another, which has its own drawbacks, we propose as an alternative to represent an orthology relation $\delta$ in terms of a structure more general than a phylogenetic tree called a phylogenetic network. To compute such a network in the form of a level-1 representation for $\delta$, we formalize an orthology relation in terms of the novel concept of a symbolic 3-dissimilarity, which is motivated by the biological concept of a "cluster of orthologous groups", or COG for short. For such maps, which assign symbols rather than real values to elements, we introduce the novel Network-Popping algorithm, which has several attractive properties. In addition, we characterize an orthology relation $\delta$ on some set $X$ that has a level-1 representation in terms of eight natural properties for $\delta$ as well as in terms of level-1 representations of orthology relations on certain subsets of $X$.

Crossref

Springer - Publisher Connector

University of East Anglia digital repository

Fast and robust multiple sequence alignment with phylogeny-aware gap placement

Author: A Biegert
A Löytynoja
A Viterbi
Adam M Szalkowski
AM Altenhoff
AM Szalkowski
B Paten
C Dessimoz
C Grasso
C Lee
D Robinson
DA Dalquen
G Gonnet
GH Gonnet
GW Stuart
J Felsenstein
JD Thompson
JL Thorne
JM Sauder
K Katoh
M Anisimova
M Kimura
O Gascuel
O Gotoh
R Durbin
RC Edgar
S Pascarella
S Whelan
SA Benner
SB Needleman
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Managing ethnic conflict: the menu of institutional engineering

Author: A Carattoli
A Mira
AM Altenhoff
B Contreras-Moreira
B Sachman-Ruiz
C Camacho
DM Kristensen
EL Sonnhammer
EV Koonin
H Tettelin
H Willenbrock
I Pagani
K Forslund
L Li
L Poirel
L Snipen
P Nordmann
P Vinuesa
RA Welch
RC Edgar
RC Moellering Jr
RD Finn
RL Tatusov
RS Kaas
S Guindon
SR Eddy
T Sekizuka
T Tatusova
TJ Johnson
WF Fricke
YI Wolf
Publication venue: Wissenschaftliche Einrichtungen. GIGA - German Institute of Global and Area Studies
Publication date: 01/01/2011

The debate on institutional engineering offers options to manage ethnic and other conflicts. This contribution systematically assesses the logic of these institutional designs and the empirical evidence on their functioning. Generally, institutions can work on ethnic conflict by either accommodating (“consociationalists”) or denying (“integrationists”) ethnicity in politics. Looking at individual and combined institutions (e.g. state structure, electoral system, forms of government), the literature review finds that most designs are theoretically ambivalent and that empirical evidence on their effectiveness is mostly inconclusive. The following questions remain open: a) Is politicized ethnicity really a conflict risk? b) What impact does the whole “menu” (not just single institutions) have? and c) How are effects conditioned by the exact nature of conflict risks?

Crossref

eDoc.VifaPol

Digital.CSIC